Identifying Cross-Document Relations between Sentences

Authors

  • Yasunari Miyabe
  • Hiroya Takamura
  • Manabu Okumura
Abstract

A pair of sentences in different newspaper articles on an event can have one of several relations. Of these, we have focused on two, i.e., equivalence and transition. Equivalence is the relation between two sentences that have the same information on an event. Transition is the relation between two sentences that have the same information except for values of numeric attributes. We propose methods of identifying these relations. We first split a dataset consisting of pairs of sentences into clusters according to their similarities, and then construct a classifier for each cluster to identify equivalence relations. We also adopt a “coarse-to-fine” approach. We further propose using the identified equivalence relations to address the task of identifying transition relations.
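The cluster-then-classify step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, the clustering method, or the classifier, so the word-overlap (Jaccard) similarity, the fixed-width similarity buckets, and the per-cluster threshold classifier below are all stand-ins, and every function name is hypothetical.

```python
from collections import defaultdict


def jaccard(s1, s2):
    """Word-overlap similarity between two sentences (assumed measure)."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def bucket(sim, n_buckets=2):
    """Map a similarity in [0, 1] to a cluster index (assumed clustering)."""
    return min(int(sim * n_buckets), n_buckets - 1)


def train(pairs, labels, n_buckets=2):
    """Fit one classifier per similarity cluster.

    Each per-cluster "classifier" is just the similarity threshold that
    classifies the most training pairs of that cluster correctly; a real
    system would presumably use a richer feature set and learner.
    """
    clusters = defaultdict(list)
    for (s1, s2), is_equivalent in zip(pairs, labels):
        sim = jaccard(s1, s2)
        clusters[bucket(sim, n_buckets)].append((sim, is_equivalent))

    model = {}
    for k, items in clusters.items():
        # Candidate thresholds: every observed similarity, plus "never equivalent".
        candidates = sorted({sim for sim, _ in items}) + [float("inf")]
        model[k] = max(
            candidates,
            key=lambda t: sum((sim >= t) == y for sim, y in items),
        )
    return model


def predict(model, s1, s2, n_buckets=2):
    """Route the pair to its cluster, then apply that cluster's threshold."""
    sim = jaccard(s1, s2)
    threshold = model.get(bucket(sim, n_buckets), float("inf"))
    return sim >= threshold
```

The point of the design is that pairs with very different surface similarity are judged by different decision rules, instead of one global rule for the whole dataset. Pairs that fall into a cluster never seen in training are labeled non-equivalent by default, which is another assumption of this sketch.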

Similar resources

Key Sentence Extraction from Single Document based on Triangle Analysis in Dependency Graph

Document summarization is a technique that aims to automatically extract the main ideas from electronic documents. In this paper, we propose a novel algorithm, called TriangleSum, for key sentence extraction from a single document based on graph theory. The algorithm builds a dependency graph for the underlying document based on co-occurrence relations as well as syntactic dependency relations. The nodes r...

The Future of Multilingual Summarization: Beyond Sentence Extraction

In this paper I present a vision for the future of multilingual summarization that focuses on summarizing differences between documents: generating sentences that explain the main points of controversy in the document set, identifying different sides in the dialogue and the claims they support, and identifying how content differs across document boundaries (cultural, national, political, etc.)....

Graph-Based Approach to Recognizing CST Relations in Polish Texts

This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in Polish texts. In the proposed approach, a graph-based representation is constructed for sentences. Graphs are built on the basis of lexicalised syntactic-semantic relations extracted from text. Similarity between sentences is calculated on their graphs, and the values are used as features to ...

“How-to” Questions Answering Using Relations-based Summarization

Submitted: Aug 10, 2013; Accepted: Sep 21, 2013; Published: Sep 25, 2013 Abstract: The problem considered in this paper relates to searching for "How-to" question answers and identifying the main semantic part in the answers found. We propose a “bag-of-relations” method for document summarization. This approach consists in identifying sentences that correspond to the key relations the most. Rat...

Single Document Summarization with Document Expansion

Existing methods for single document summarization usually make use of only the information contained in the specified document. This paper proposes the technique of document expansion to provide more knowledge to help single document summarization. A specified document is expanded to a small document set by adding a few neighbor documents close to the document, and then the graph-ranking based ...

Publication date: 2008